Lab Assignment Six: CNNs¶

Course: Machine Learning in Python¶

Semester: Fall 2023¶

Team: Perceptron¶

Authors¶

Brian Mendes (49243148)

Prashant Iyer (49352530)

Ritik K (49347408)


Data Sources:¶

Vehicle Detection Image Dataset Kaggle - https://www.kaggle.com/datasets/brsdincer/vehicle-detection-image-set

Business Understanding:¶

Vehicle detection plays a crucial role in managing traffic in urban areas and is a critical part of urban planning, transportation, and safety. A system that categorizes images as vehicle or non-vehicle supports better planning to avoid road congestion, especially during peak/rush hours.

There are two main objectives in traffic management, with respect to de-congestion, that can be fulfilled by a vehicle detection algorithm:

De-congesting Roads: This model can help monitor hotspots and bottlenecks. Collecting this information supports alternative strategies such as re-routing options, lane control, and building crossovers to relieve these problems.

Traffic Control: This model can be implemented in smart traffic control systems. Detecting vehicles at every intersection can help adjust traffic signals in real time, alter waiting times, and improve traffic flow in live traffic scenarios.

The basis of this smart-city initiative for efficient traffic management is detecting whether an object is a vehicle or not. Accuracy better than random choice is a positive step toward making this system a reality, contributing to efficient and safe urban environments. We aim to establish and maintain a balance between correctly classified vehicles and avoided misclassifications while sustaining high accuracy.

Problem Addressed: How accurately can the system classify vehicles and non-vehicles? An accuracy above 90% would substantially improve traffic management strategies.

Evaluation Metric¶

Our main task is to accurately identify the vehicles in the dataset. Given the nature of this problem, along with improving accuracy we must also minimize error rates. Our goal is to judge the models on the balance they maintain between correct and incorrect classifications of the data.

In tasks like vehicle detection, false positives (misclassifying a non-vehicle as a vehicle) and false negatives (misclassifying a vehicle as a non-vehicle) may have different costs or implications. Thus, to maintain this balance, the F1 score will be the primary source of truth for judging any model. The F1 score takes both kinds of errors into account and aids in striking a compromise that lowers the overall cost. It provides a holistic view of the model's performance when classifying the data.

The F1 score is the harmonic mean of precision and recall, providing a balance between the two metrics. In vehicle detection, precision measures how many of the instances classified as vehicles actually are vehicles, and recall measures how many of the actual vehicles are correctly identified. The F1 score is a good compromise when both precision and recall matter, as it penalizes models that favor one at the expense of the other. It will be supported with accuracy scores and a confusion matrix to justify the results.
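As a quick illustration of the metric itself, the sketch below (using scikit-learn, with made-up predictions — not the dataset above) shows how F1 relates to precision and recall:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical labels: 1 = vehicle, 0 = non-vehicle
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # one false negative, one false positive

p = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3/4
r = recall_score(y_true, y_pred)     # TP / (TP + FN) = 3/4
f1 = f1_score(y_true, y_pred)        # harmonic mean: 2pr / (p + r)

print(p, r, f1)  # 0.75 0.75 0.75
```

Because F1 is a harmonic mean, it drops sharply when either precision or recall is low, which is exactly the behavior we want here.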

In [1]:
#Import the required libraries

from PIL import Image
import numpy as np
import pandas as pd
import os
import random
import warnings
warnings.filterwarnings("ignore")
import matplotlib.pyplot as plt
%matplotlib inline

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Reshape, Input
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import RandomFlip, RandomRotation
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.regularizers import l2
from tensorflow.keras.layers import average 
from tensorflow.keras.models import  Model
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam
In [2]:
#Datasets
vehicles_dataset = 'vehicles'
non_vehicles_dataset = 'non-vehicles'

# Create an empty list to store the 1D arrays
transformed_images = []
labels = []  # Create an empty list to store the labels

# Function to convert images and assign labels
def convert_images(dataset_path, label):
    images = []
    for filename in os.listdir(dataset_path):
        if filename.endswith(".png"):
            image_path = os.path.join(dataset_path, filename)
            with Image.open(image_path) as img:
                # Convert the image to grayscale
                grayscale_img = img.convert("L")  # "L" mode stands for grayscale
                grayscale_array = np.array(grayscale_img)
                transformed_image = grayscale_array.flatten()  # Flatten the 2D array to 1D
                images.append(transformed_image)
                labels.append(label)  # Append the label
    return images

# Load and flatten images from both datasets with labels
vehicles = convert_images(vehicles_dataset, label=1)  #label = 1; vehicles
non_vehicles = convert_images(non_vehicles_dataset, label=0) #label = 0; non-vehicles 

# Merge both the datasets
transformed_images.extend(vehicles)
transformed_images.extend(non_vehicles)

# Combine the images and labels into a single list and shuffle
combined_data = list(zip(transformed_images, labels))
random.shuffle(combined_data)


#Split the combined data back into images and labels
transformed_images, labels = zip(*combined_data)


# Convert to numpy array
grayscale_images = np.array(transformed_images)
In [3]:
#Number of images in the dataset (each is a flattened 64x64 vector)
len(grayscale_images)
Out[3]:
17760
In [4]:
#Min-Max Scaling on the image vector
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

if len(grayscale_images.shape) == 1:
    grayscale_images = grayscale_images.reshape(-1, 1)

# Fit the scaler on your data and transform it
scaled_images = scaler.fit_transform(grayscale_images)

# If you want to retain it as a numpy array (optional)
grayscale_images = np.array(scaled_images)
In [5]:
#Count the number of vehicles and non-vehicle images in the dataset
from collections import Counter

Counter(labels).keys()  # distinct class labels
Counter(labels).values()
Out[5]:
dict_values([8968, 8792])
In [6]:
#Function to reshape the image to their original pixel values and display

def plot_gallery(images, titles, h, w, num_images):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(15,5))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(num_images):
        plt.subplot(1, num_images, i + 1)
        image = images[i].reshape((h, w))  # Assuming each image is hxw pixels
        label = "Vehicle" if labels[i] == 1 else "Non-Vehicle"
        plt.imshow(image, cmap='gray')
        plt.title(label)
        plt.axis('off')
In [7]:
#Display 10 images present in the dataset
plot_gallery(grayscale_images, labels, 64, 64,10)
In [8]:
grayscale_images[1]
Out[8]:
array([0.13333333, 0.14901961, 0.23529412, ..., 0.25490196, 0.25490196,
       0.25882353])

Standard PCA¶

The purpose of PCA here was to reduce the dimensionality of the image data while retaining as much variance (information) as possible.

In [9]:
%%time

import time
from sklearn.decomposition import PCA

#PCA for dimension reduction
start_time = time.time()

# Number of principal components (eigenvectors) to be considered
n_components = 50  

#Define the original dimensions of the image
h = 64
w = 64

# Create a PCA instance and fit it to your data
pca = PCA(n_components=n_components)
pca.fit(grayscale_images)

pca_data = pca.transform(grayscale_images)

reconstructed_data_pca = pca.inverse_transform(pca_data)

pca_time = time.time() - start_time
print(pca_time)

plot_gallery(reconstructed_data_pca, labels, 64, 64,10)
10.100260019302368
CPU times: user 1min 2s, sys: 16.3 s, total: 1min 19s
Wall time: 10.3 s
In [10]:
def plot_explained_variance(pca):
    import plotly
    from plotly.graph_objs import Bar, Line
    from plotly.graph_objs import Scatter, Layout
    from plotly.graph_objs.scatter import Marker
    from plotly.graph_objs.layout import XAxis, YAxis
    plotly.offline.init_notebook_mode() # run at the start of every notebook
    
    explained_var = pca.explained_variance_ratio_
    cum_var_exp = np.cumsum(explained_var)
    
    plotly.offline.iplot({
        "data": [Bar(y=explained_var, name='individual explained variance'),
                 Scatter(y=cum_var_exp, name='cumulative explained variance')
            ],
        "layout": Layout(xaxis=XAxis(title='Principal components'), yaxis=YAxis(title='Explained variance ratio'))
    })

plot_explained_variance(pca)

After plotting the explained variances, it becomes evident that not all components are equally important. The first few components explain a significant amount of variance, while the latter ones add diminishing returns.

With a cumulative variance plot, we set a threshold (85%) to see how many components are needed to reach that level of explained variance. This approach helps determine the trade-off between dimensionality and information retention.

Using PCA, it's possible to significantly reduce the dimensionality of the dataset without a drastic loss of information. Specifically, 50 principal components can represent ~85% of the variance in the original dataset. Beyond this point, there is no significant increase in explained variance until many more dimensions are added; the curve flattens out. This reduced dimensionality can make subsequent analyses or model training more efficient, both in computational resources and time, while still capturing the majority of the patterns in the data. The effect of this reduction on the models can only be judged after testing the results.
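As a side note, scikit-learn can select the component count for a variance threshold directly: passing a float in (0, 1) as `n_components` keeps the smallest number of components that explains at least that fraction of variance. A minimal sketch on synthetic data (a stand-in for the image matrix, mirroring the 85% threshold above):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 100))  # synthetic stand-in for the image matrix

# Keep the smallest number of components explaining >= 85% of the variance
pca_85 = PCA(n_components=0.85)
X_reduced = pca_85.fit_transform(X_demo)

print(pca_85.n_components_, pca_85.explained_variance_ratio_.sum())
```

This avoids hand-reading the cumulative variance curve when the threshold, rather than the component count, is the design choice.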

In [11]:
reconstructed_data_pca[1]
X = reconstructed_data_pca.reshape((reconstructed_data_pca.shape[0],64,64,1)) # reshape as images
y = labels
In [12]:
unique_values, counts = np.unique(y, return_counts=True)
In [13]:
for value, count in zip(unique_values, counts):
    print(f"{value}: {count}")
0: 8968
1: 8792
In [14]:
from sklearn.model_selection import ShuffleSplit
import numpy as np

# Ensure X and y are NumPy arrays
X = np.array(X)
y = np.array(y)

# Assuming X is your image data and y is your labels
# Adjust the test_size and random_state as needed
shuffle_split = ShuffleSplit(n_splits=1, test_size=0.2, random_state=42)

# Ensure y is 1D
if len(y.shape) > 1:
    y = y.ravel()

for train_index, test_index in shuffle_split.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

Why Shuffle Split: Shuffle Split randomly shuffles the data and splits it into training and testing sets. Since the class distribution in our dataset is relatively balanced, Shuffle Split is an efficient choice. The randomness it introduces in the ordering of the dataset ensures a representative mix of the data, and in a balanced dataset like this it prevents any bias that might arise from ordering. It also reduces sensitivity to potential variation in the data distribution.

Shuffle Split reduces the risk of overfitting to any particular ordering of the dataset. It improves the model's ability to generalize to unseen data and strengthens the evaluation, since ordering-related dependencies in the dataset have already been eliminated.

CNN Architectures¶

We have built 2 CNN architectures, with 2 models in each architecture using different hyperparameters, in order to judge the performance of the models in a more holistic way that accounts for any biases arising from a given parameter.

Data Expansion Techniques: We used ImageDataGenerator for data expansion. This data augmentation method suits the vehicle detection task because it captures variations in the appearance of vehicles in the images. The technique enhances the model's ability to classify unseen data by exposing it to a diverse range of transformations similar to those found in the real world. It covers parameters such as the orientation of the image, the position of the object within the frame, zoomed-in versions, and varied angles, and fills in any newly created pixels with their nearest neighbours. Training on as many variations of the data as possible helps the model perform to the best of its capability when put into execution.

In [15]:
#Architecture 1

NUM_CLASSES = 2
img_wh = 64
y_ohe = to_categorical(y, NUM_CLASSES)

# Split the data using ShuffleSplit
shuffle_split = ShuffleSplit(n_splits=1, test_size=0.2, random_state=42)

for train_index, test_index in shuffle_split.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train_ohe, y_test_ohe = y_ohe[train_index], y_ohe[test_index]

# Data augmentation
datagen = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Function to create a CNN model
def create_model(filters, kernel_regularizer, dropout_rate):
    optimizer = Adam(learning_rate=0.0001)
    model = Sequential()
    model.add(Conv2D(filters=filters,
                    input_shape=(img_wh, img_wh, 1),
                    kernel_size=(3, 3),
                    kernel_initializer='he_uniform',
                    kernel_regularizer=kernel_regularizer,
                    padding='same',
                    activation='relu',
                    data_format="channels_last")) 

    model.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_last"))

    model.add(Flatten())
    model.add(Dropout(dropout_rate))
    model.add(Dense(NUM_CLASSES, 
                    activation='softmax', 
                    kernel_initializer='glorot_uniform',
                    kernel_regularizer=kernel_regularizer))
    model.compile(optimizer=optimizer,
               loss='categorical_crossentropy',
               metrics=['accuracy'])
    model.summary()
    
    return model
In [16]:
# Define parameters for model variations
params_cnn1 = {'filters': 32, 'kernel_regularizer': 'l2', 'dropout_rate': 0.25}
params_cnn2 = {'filters': 64, 'kernel_regularizer': 'l2', 'dropout_rate': 0.5}
cnn1 = create_model(**params_cnn1)
cnn2 = create_model(**params_cnn2)
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 64, 64, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2  (None, 32, 32, 32)        0         
 D)                                                              
                                                                 
 flatten (Flatten)           (None, 32768)             0         
                                                                 
 dropout (Dropout)           (None, 32768)             0         
                                                                 
 dense (Dense)               (None, 2)                 65538     
                                                                 
=================================================================
Total params: 65858 (257.26 KB)
Trainable params: 65858 (257.26 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_1 (Conv2D)           (None, 64, 64, 64)        640       
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 32, 32, 64)        0         
 g2D)                                                            
                                                                 
 flatten_1 (Flatten)         (None, 65536)             0         
                                                                 
 dropout_1 (Dropout)         (None, 65536)             0         
                                                                 
 dense_1 (Dense)             (None, 2)                 131074    
                                                                 
=================================================================
Total params: 131714 (514.51 KB)
Trainable params: 131714 (514.51 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In [17]:
# fit_generator is deprecated in TF 2.x; model.fit accepts generators directly
history_cnn1 = cnn1.fit(datagen.flow(X_train, y_train_ohe, batch_size=128),
                        epochs=10, verbose=1,
                        validation_data=(X_test, y_test_ohe)
                        )

history_cnn2 = cnn2.fit(datagen.flow(X_train, y_train_ohe, batch_size=128),
                        epochs=10, verbose=1,
                        validation_data=(X_test, y_test_ohe)
                        )
Epoch 1/10
111/111 [==============================] - 5s 45ms/step - loss: 1.2623 - accuracy: 0.7081 - val_loss: 1.0922 - val_accuracy: 0.8533
Epoch 2/10
111/111 [==============================] - 5s 47ms/step - loss: 1.1297 - accuracy: 0.7891 - val_loss: 1.0094 - val_accuracy: 0.8573
Epoch 3/10
111/111 [==============================] - 5s 48ms/step - loss: 1.0581 - accuracy: 0.8194 - val_loss: 0.9958 - val_accuracy: 0.8550
Epoch 4/10
111/111 [==============================] - 5s 42ms/step - loss: 1.0150 - accuracy: 0.8402 - val_loss: 0.9395 - val_accuracy: 0.8758
Epoch 5/10
111/111 [==============================] - 5s 43ms/step - loss: 0.9906 - accuracy: 0.8378 - val_loss: 0.9152 - val_accuracy: 0.8820
Epoch 6/10
111/111 [==============================] - 5s 43ms/step - loss: 0.9669 - accuracy: 0.8438 - val_loss: 0.9290 - val_accuracy: 0.8618
Epoch 7/10
111/111 [==============================] - 5s 41ms/step - loss: 0.9429 - accuracy: 0.8479 - val_loss: 0.9150 - val_accuracy: 0.8620
Epoch 8/10
111/111 [==============================] - 5s 41ms/step - loss: 0.9281 - accuracy: 0.8420 - val_loss: 0.8757 - val_accuracy: 0.8761
Epoch 9/10
111/111 [==============================] - 5s 43ms/step - loss: 0.9057 - accuracy: 0.8487 - val_loss: 0.8597 - val_accuracy: 0.8739
Epoch 10/10
111/111 [==============================] - 5s 41ms/step - loss: 0.8943 - accuracy: 0.8420 - val_loss: 0.8313 - val_accuracy: 0.8787
Epoch 1/10
111/111 [==============================] - 8s 74ms/step - loss: 1.9480 - accuracy: 0.6710 - val_loss: 1.8019 - val_accuracy: 0.7325
Epoch 2/10
111/111 [==============================] - 9s 83ms/step - loss: 1.7739 - accuracy: 0.7708 - val_loss: 1.6004 - val_accuracy: 0.8646
Epoch 3/10
111/111 [==============================] - 8s 75ms/step - loss: 1.6801 - accuracy: 0.8091 - val_loss: 1.6122 - val_accuracy: 0.8440
Epoch 4/10
111/111 [==============================] - 9s 76ms/step - loss: 1.6070 - accuracy: 0.8296 - val_loss: 1.5019 - val_accuracy: 0.8787
Epoch 5/10
111/111 [==============================] - 8s 73ms/step - loss: 1.5578 - accuracy: 0.8407 - val_loss: 1.4571 - val_accuracy: 0.8815
Epoch 6/10
111/111 [==============================] - 8s 73ms/step - loss: 1.5150 - accuracy: 0.8419 - val_loss: 1.4736 - val_accuracy: 0.8581
Epoch 7/10
111/111 [==============================] - 9s 77ms/step - loss: 1.4786 - accuracy: 0.8409 - val_loss: 1.4369 - val_accuracy: 0.8620
Epoch 8/10
111/111 [==============================] - 8s 74ms/step - loss: 1.4370 - accuracy: 0.8483 - val_loss: 1.3614 - val_accuracy: 0.8894
Epoch 9/10
111/111 [==============================] - 8s 75ms/step - loss: 1.4054 - accuracy: 0.8489 - val_loss: 1.3529 - val_accuracy: 0.8781
Epoch 10/10
111/111 [==============================] - 8s 72ms/step - loss: 1.3684 - accuracy: 0.8547 - val_loss: 1.3167 - val_accuracy: 0.8818
In [18]:
def plot_history(history, title):
    plt.figure(figsize=(12, 6))

    # Plot training & validation accuracy values
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title(title + ' - Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend(['Train', 'Test'], loc='upper left')

    # Plot training & validation loss values
    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title(title + ' - Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend(['Train', 'Test'], loc='upper left')

    plt.show()

# Visualize the performance of the models
plot_history(history_cnn1, 'Model 1')
plot_history(history_cnn2, 'Model 2')
In [19]:
#Architecture 2

import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import ShuffleSplit
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical


# Function to create the model
def create_model(filter1, filter2):
    model = models.Sequential()
    model.add(layers.Conv2D(filter1, (3, 3), activation='relu', input_shape=(64, 64, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(filter2, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(filter2, (3, 3), activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(2, activation='softmax'))

    # Compile the model
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    model.summary()
    
    return model


# Data augmentation
datagen = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Convert labels to one-hot encoding
NUM_CLASSES = 2
img_wh = 64
y_ohe = to_categorical(y, NUM_CLASSES)

# Split the data using ShuffleSplit
shuffle_split = ShuffleSplit(n_splits=1, test_size=0.2, random_state=42)

for train_index, test_index in shuffle_split.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train_ohe, y_test_ohe = y_ohe[train_index], y_ohe[test_index]

# Create the model
# Example: create a model with 64 filters for the first layer and 128 filters for the second layer
cnn3 = create_model(filter1=64, filter2=128)
cnn4 = create_model(filter1=32, filter2=64)

# Train the model with data augmentation
# fit_generator is deprecated in TF 2.x; model.fit accepts generators directly.
# These models use sparse_categorical_crossentropy, so they train on the integer labels.
history_cnn3 = cnn3.fit(datagen.flow(X_train, y_train, batch_size=128),
                        epochs=10, verbose=1,
                        validation_data=(X_test, y_test))
history_cnn4 = cnn4.fit(datagen.flow(X_train, y_train, batch_size=128),
                        epochs=5, verbose=1,
                        validation_data=(X_test, y_test))
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_2 (Conv2D)           (None, 62, 62, 64)        640       
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 31, 31, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_3 (Conv2D)           (None, 29, 29, 128)       73856     
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 14, 14, 128)       0         
 g2D)                                                            
                                                                 
 conv2d_4 (Conv2D)           (None, 12, 12, 128)       147584    
                                                                 
 flatten_2 (Flatten)         (None, 18432)             0         
                                                                 
 dense_2 (Dense)             (None, 64)                1179712   
                                                                 
 dense_3 (Dense)             (None, 2)                 130       
                                                                 
=================================================================
Total params: 1401922 (5.35 MB)
Trainable params: 1401922 (5.35 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_5 (Conv2D)           (None, 62, 62, 32)        320       
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 31, 31, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_6 (Conv2D)           (None, 29, 29, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPoolin  (None, 14, 14, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_7 (Conv2D)           (None, 12, 12, 64)        36928     
                                                                 
 flatten_3 (Flatten)         (None, 9216)              0         
                                                                 
 dense_4 (Dense)             (None, 64)                589888    
                                                                 
 dense_5 (Dense)             (None, 2)                 130       
                                                                 
=================================================================
Total params: 645762 (2.46 MB)
Trainable params: 645762 (2.46 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/10
111/111 [==============================] - 33s 298ms/step - loss: 0.4198 - accuracy: 0.7935 - val_loss: 0.2728 - val_accuracy: 0.8798
Epoch 2/10
111/111 [==============================] - 32s 283ms/step - loss: 0.2972 - accuracy: 0.8694 - val_loss: 0.2700 - val_accuracy: 0.8677
Epoch 3/10
111/111 [==============================] - 31s 282ms/step - loss: 0.2669 - accuracy: 0.8838 - val_loss: 0.2679 - val_accuracy: 0.8854
Epoch 4/10
111/111 [==============================] - 32s 284ms/step - loss: 0.2302 - accuracy: 0.9034 - val_loss: 0.1855 - val_accuracy: 0.9223
Epoch 5/10
111/111 [==============================] - 32s 284ms/step - loss: 0.2108 - accuracy: 0.9134 - val_loss: 0.1515 - val_accuracy: 0.9381
Epoch 6/10
111/111 [==============================] - 32s 285ms/step - loss: 0.1935 - accuracy: 0.9215 - val_loss: 0.1498 - val_accuracy: 0.9445
Epoch 7/10
111/111 [==============================] - 32s 286ms/step - loss: 0.1857 - accuracy: 0.9241 - val_loss: 0.1338 - val_accuracy: 0.9420
Epoch 8/10
111/111 [==============================] - 32s 286ms/step - loss: 0.1690 - accuracy: 0.9314 - val_loss: 0.1482 - val_accuracy: 0.9445
Epoch 9/10
111/111 [==============================] - 32s 288ms/step - loss: 0.1702 - accuracy: 0.9308 - val_loss: 0.1610 - val_accuracy: 0.9406
Epoch 10/10
111/111 [==============================] - 32s 286ms/step - loss: 0.1514 - accuracy: 0.9399 - val_loss: 0.1295 - val_accuracy: 0.9507
Epoch 1/5
111/111 [==============================] - 12s 111ms/step - loss: 0.4005 - accuracy: 0.8076 - val_loss: 0.3140 - val_accuracy: 0.8629
Epoch 2/5
111/111 [==============================] - 12s 112ms/step - loss: 0.2903 - accuracy: 0.8701 - val_loss: 0.2144 - val_accuracy: 0.9110
Epoch 3/5
111/111 [==============================] - 12s 110ms/step - loss: 0.2508 - accuracy: 0.8868 - val_loss: 0.2416 - val_accuracy: 0.9034
Epoch 4/5
111/111 [==============================] - 12s 110ms/step - loss: 0.2362 - accuracy: 0.8984 - val_loss: 0.2191 - val_accuracy: 0.9074
Epoch 5/5
111/111 [==============================] - 12s 112ms/step - loss: 0.2213 - accuracy: 0.9048 - val_loss: 0.1623 - val_accuracy: 0.9386
In [20]:
# test_loss, test_acc = cnn3.evaluate(X_test, y_test)
# print(f'Test accuracy: {test_acc}')

# test_loss, test_acc = cnn4.evaluate(X_test, y_test)
# print(f'Test accuracy: {test_acc}')
In [21]:
def plot_history(history, title):
    plt.figure(figsize=(12, 6))

    # Plot training & validation accuracy values
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title(title + ' - Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend(['Train', 'Test'], loc='upper left')

    # Plot training & validation loss values
    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title(title + ' - Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend(['Train', 'Test'], loc='upper left')

    plt.show()

# Visualize the performance of the models
plot_history(history_cnn3, 'Model 3')
plot_history(history_cnn4, 'Model 4')
In [22]:
from scipy.stats import chi2_contingency
import numpy as np

# Assuming you have predictions from cnn1, cnn2, cnn3, cnn4 on the test set
y_pred_cnn1 = np.argmax(cnn1.predict(X_test), axis=1)
y_pred_cnn2 = np.argmax(cnn2.predict(X_test), axis=1)
y_pred_cnn3 = np.argmax(cnn3.predict(X_test), axis=1)
y_pred_cnn4 = np.argmax(cnn4.predict(X_test), axis=1)


# Create contingency tables
table_cnn1_cnn3 = np.array([[np.sum((y_test == i) & (y_pred_cnn1 == i)), np.sum((y_test == i) & (y_pred_cnn3 == i))]
                            for i in range(NUM_CLASSES)])

table_cnn1_cnn4 = np.array([[np.sum((y_test == i) & (y_pred_cnn1 == i)), np.sum((y_test == i) & (y_pred_cnn4 == i))]
                            for i in range(NUM_CLASSES)])

table_cnn2_cnn3 = np.array([[np.sum((y_test == i) & (y_pred_cnn2 == i)), np.sum((y_test == i) & (y_pred_cnn3 == i))]
                            for i in range(NUM_CLASSES)])

table_cnn2_cnn4 = np.array([[np.sum((y_test == i) & (y_pred_cnn2 == i)), np.sum((y_test == i) & (y_pred_cnn4 == i))]
                            for i in range(NUM_CLASSES)])

table_cnn1_cnn2 = np.array([[np.sum((y_pred_cnn1 == i) & (y_pred_cnn2 == i)), np.sum((y_pred_cnn1 != i) & (y_pred_cnn2 == i))]
                            for i in range(NUM_CLASSES)])

table_cnn3_cnn4 = np.array([[np.sum((y_pred_cnn3 == i) & (y_pred_cnn4 == i)), np.sum((y_pred_cnn3 != i) & (y_pred_cnn4 == i))]
                            for i in range(NUM_CLASSES)])

# Perform McNemar's test
_, p_value_cnn1_cnn3, _, _ = chi2_contingency(table_cnn1_cnn3)
_, p_value_cnn1_cnn4, _, _ = chi2_contingency(table_cnn1_cnn4)
_, p_value_cnn2_cnn3, _, _ = chi2_contingency(table_cnn2_cnn3)
_, p_value_cnn2_cnn4, _, _ = chi2_contingency(table_cnn2_cnn4)
_, p_value_cnn1_cnn2, _, _ = chi2_contingency(table_cnn1_cnn2)
_, p_value_cnn3_cnn4, _, _ = chi2_contingency(table_cnn3_cnn4)

# Print p-values
print("McNemar's Test p-values:")
print("cnn1 vs cnn3:", p_value_cnn1_cnn3)
print("cnn1 vs cnn4:", p_value_cnn1_cnn4)
print("cnn2 vs cnn3:", p_value_cnn2_cnn3)
print("cnn2 vs cnn4:", p_value_cnn2_cnn4)
print("cnn1 vs cnn2:", p_value_cnn1_cnn2)
print("cnn3 vs cnn4:", p_value_cnn3_cnn4)
111/111 [==============================] - 0s 3ms/step
111/111 [==============================] - 0s 4ms/step
111/111 [==============================] - 2s 21ms/step
111/111 [==============================] - 1s 10ms/step
McNemar's Test p-values:
cnn1 vs cnn3: 0.11997859537884369
cnn1 vs cnn4: 0.14187201371055405
cnn2 vs cnn3: 0.008135296282217257
cnn2 vs cnn4: 0.010558471879953119
cnn1 vs cnn2: 2.7811886137535967e-10
cnn3 vs cnn4: 0.9215586027896007

When comparing the four models on these p-values:

A low p-value (below a significance threshold such as 0.05) indicates evidence of a significant difference in performance between the compared models; a high p-value suggests their performances are comparable.

Here, cnn1 vs cnn3 (p ≈ 0.12), cnn1 vs cnn4 (p ≈ 0.14), and cnn3 vs cnn4 (p ≈ 0.92) are above 0.05, so those pairs perform in a comparable space; cnn3 and cnn4 in particular share the same architecture with different parameters, which fits their near-identical behaviour. In contrast, cnn2 vs cnn3 (p ≈ 0.008), cnn2 vs cnn4 (p ≈ 0.011), and especially cnn1 vs cnn2 (p ≈ 2.8e-10) fall below 0.05, indicating statistically significant differences in performance for those pairs.
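For reference, a strict McNemar's test operates on the paired discordant counts (cases one model gets right and the other gets wrong) rather than on per-class contingency tables. A minimal sketch with synthetic labels and predictions (the arrays below are illustrative, not the notebook's actual predictions):

```python
import numpy as np
from scipy.stats import chi2

def mcnemar_p(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test on two classifiers' paired errors."""
    correct_a = pred_a == y_true
    correct_b = pred_b == y_true
    b = np.sum(correct_a & ~correct_b)  # a right, b wrong
    c = np.sum(~correct_a & correct_b)  # a wrong, b right
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return chi2.sf(stat, df=1)

# Synthetic ground truth and two classifiers with ~10% and ~12% error rates
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
flip_a = rng.random(500) < 0.10
flip_b = rng.random(500) < 0.12
pred_a = np.where(flip_a, 1 - y_true, y_true)
pred_b = np.where(flip_b, 1 - y_true, y_true)
print(mcnemar_p(y_true, pred_a, pred_b))
```

Only the off-diagonal (discordant) cells carry information about which model is better, which is why the test ignores the cases both models classify identically.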

In [23]:
from sklearn.metrics import f1_score, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

y_pred_cnn1 = np.argmax(cnn1.predict(X_test), axis=1)
y_pred_cnn2 = np.argmax(cnn2.predict(X_test), axis=1)
y_pred_cnn3 = np.argmax(cnn3.predict(X_test), axis=1)
y_pred_cnn4 = np.argmax(cnn4.predict(X_test), axis=1)

# Compute F1 scores
f1_cnn1 = f1_score(y_test, y_pred_cnn1, average='weighted')
f1_cnn2 = f1_score(y_test, y_pred_cnn2, average='weighted')
f1_cnn3 = f1_score(y_test, y_pred_cnn3, average='weighted')
f1_cnn4 = f1_score(y_test, y_pred_cnn4, average='weighted')

print("F1 Scores:")
print("cnn1:", f1_cnn1)
print("cnn2:", f1_cnn2)
print("cnn3:", f1_cnn3)
print("cnn4:", f1_cnn4)

# Display confusion matrices
def plot_confusion_matrix(y_true, y_pred, model_name):
    cm = confusion_matrix(y_true, y_pred)
    plt.figure(figsize=(5, 5))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=[0, 1], yticklabels=[0, 1])
    plt.title(f'Confusion Matrix - {model_name}')
    plt.xlabel('Predicted')
    plt.ylabel('True')
    plt.show()

plot_confusion_matrix(y_test, y_pred_cnn1, 'cnn1')
plot_confusion_matrix(y_test, y_pred_cnn2, 'cnn2')
plot_confusion_matrix(y_test, y_pred_cnn3, 'cnn3')
plot_confusion_matrix(y_test, y_pred_cnn4, 'cnn4')
111/111 [==============================] - 0s 3ms/step
111/111 [==============================] - 0s 4ms/step
111/111 [==============================] - 3s 24ms/step
111/111 [==============================] - 1s 10ms/step
F1 Scores:
cnn1: 0.8784269244468756
cnn2: 0.8811913557576982
cnn3: 0.9507410848161573
cnn4: 0.9386341623941362

The F1 score is our primary metric for determining the best-performing model. Alongside it, we also consider overall accuracy and the confusion matrix to support our decision.

Based on F1 score:

The CNN3 model has the best F1 score at roughly 95%. This suggests it strikes the right balance between precision and recall on this dataset, keeping the damage from false positives and false negatives to a minimum. Looking at the confusion matrices, CNN3 has the fewest false positives and false negatives, which indicates the model is performing as expected in classifying vehicles and non-vehicles. Consistent with this, it also has the best accuracy of the four models, at about 95%. All of these factors make it highly functional and accurate for classifying vehicles when deployed on unseen real-world images.

As an additional point, this suggests that a simple CNN is more beneficial in our case than one with more layers and hyperparameters. The more complex models tend to overfit: they memorize the training set and predict the unseen test data on that basis. Given the nature of this problem, a simpler network with well-chosen hyperparameters can perform much better at identifying vehicles in the given data.
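The precision/recall balance that the F1 score captures can be recomputed directly from a confusion matrix. A small sketch with hypothetical counts (not the notebook's actual matrix):

```python
import numpy as np

def f1_from_confusion(cm):
    # cm rows are true classes, columns predicted: [[TN, FP], [FN, TP]]
    tn, fp, fn, tp = cm.ravel()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

cm = np.array([[1800, 42], [30, 1680]])  # hypothetical counts for illustration
print(round(f1_from_confusion(cm), 4))  # → 0.979
```

This also makes the algebra visible: F1 = 2·TP / (2·TP + FP + FN), so the score is dragged down symmetrically by both kinds of misclassification.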

Standard Multi-layer Perceptron¶

In [24]:
#MLP

y_train_ohe = keras.utils.to_categorical(y_train, NUM_CLASSES)
y_test_ohe = keras.utils.to_categorical(y_test, NUM_CLASSES)

# make a 3 layer keras MLP
mlp = Sequential()
mlp.add( Flatten() ) # flatten images for the MLP input
mlp.add( Dense(units=30,           # input shape is inferred from the flattened images
               activation='relu') )
mlp.add( Dense(units=15, activation='relu') )
mlp.add( Dense(NUM_CLASSES) )
mlp.add( Activation('softmax') )

# note: MSE on one-hot targets trains, but categorical_crossentropy is the
# more standard loss for a softmax classifier
mlp.compile(loss='mean_squared_error',
              optimizer='rmsprop',
              metrics=['accuracy'])

mlp.fit(X_train, y_train_ohe, 
        batch_size=32, epochs=150, 
        shuffle=True, verbose=0)
Out[24]:
<keras.src.callbacks.History at 0x2917be290>
In [25]:
from sklearn import metrics as mt
from matplotlib import pyplot as plt
import seaborn as sns
%matplotlib inline

def compare_mlp_cnn(cnn, mlp, X_test, y_test, labels='auto'):
    plt.figure(figsize=(15,5))
    if cnn is not None:
        yhat_cnn = np.argmax(cnn.predict(X_test), axis=1)
        acc_cnn = mt.accuracy_score(y_test,yhat_cnn)
        plt.subplot(1,2,1)
        cm = mt.confusion_matrix(y_test,yhat_cnn)
        cm = cm/np.sum(cm,axis=1)[:,np.newaxis]
        sns.heatmap(cm, annot=True, fmt='.2f',xticklabels=labels,yticklabels=labels)
        plt.title(f'CNN: {acc_cnn:.4f}')
    
    if mlp is not None:
        yhat_mlp = np.argmax(mlp.predict(X_test), axis=1)
        acc_mlp = mt.accuracy_score(y_test,yhat_mlp)
        plt.subplot(1,2,2)
        cm = mt.confusion_matrix(y_test,yhat_mlp)
        cm = cm/np.sum(cm,axis=1)[:,np.newaxis]
        sns.heatmap(cm,annot=True, fmt='.2f',xticklabels=labels,yticklabels=labels)
        plt.title(f'MLP: {acc_mlp:.4f}')
In [26]:
## Compare CNN3 & MLP
compare_mlp_cnn(cnn3,mlp,X_test,y_test)
111/111 [==============================] - 2s 22ms/step
111/111 [==============================] - 0s 381us/step
In [27]:
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Assuming model_mlp and model_cnn3 are your trained models
y_pred_mlp = mlp.predict(X_test)
fpr_mlp, tpr_mlp, _ = roc_curve(y_test, y_pred_mlp[:, 1])
roc_auc_mlp = roc_auc_score(y_test, y_pred_mlp[:, 1])

y_pred_cnn3 = cnn3.predict(X_test)
fpr_cnn3, tpr_cnn3, _ = roc_curve(y_test, y_pred_cnn3[:, 1])
roc_auc_cnn3 = roc_auc_score(y_test, y_pred_cnn3[:, 1])

# Plot ROC curves
plt.figure(figsize=(8, 8))
plt.plot(fpr_mlp, tpr_mlp, label=f'MLP (AUC = {roc_auc_mlp:.2f})')
plt.plot(fpr_cnn3, tpr_cnn3, label=f'CNN3 (AUC = {roc_auc_cnn3:.2f})')
plt.plot([0, 1], [0, 1], linestyle='--', color='gray', label='Random')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend()
plt.show()
111/111 [==============================] - 0s 380us/step
111/111 [==============================] - 2s 20ms/step
In [28]:
from sklearn.metrics import roc_curve, roc_auc_score
import numpy as np
from scipy.stats import norm


# Assuming model_mlp and model_cnn3 are your trained models
y_pred_mlp = mlp.predict(X_test)[:, 1]
fpr_mlp, tpr_mlp, _ = roc_curve(y_test, y_pred_mlp)
roc_auc_mlp = roc_auc_score(y_test, y_pred_mlp)

y_pred_cnn3 = cnn3.predict(X_test)[:, 1]
fpr_cnn3, tpr_cnn3, _ = roc_curve(y_test, y_pred_cnn3)
roc_auc_cnn3 = roc_auc_score(y_test, y_pred_cnn3)

# Compute the z statistic for the difference in AUCs
# (a rough approximation; a full comparison would use DeLong's test)
z_statistic = (roc_auc_mlp - roc_auc_cnn3) / np.sqrt((1 / (2 * len(y_test))))

# Compute the p-value
p_value = 2 * (1 - norm.cdf(abs(z_statistic)))

# Check for statistical significance
if p_value < 0.05:
    print(f'Statistically significant difference (p-value = {p_value:.4f})')
else:
    print(f'No statistically significant difference (p-value = {p_value:.4f})')
111/111 [==============================] - 0s 388us/step
111/111 [==============================] - 2s 22ms/step
No statistically significant difference (p-value = 0.4623)
In [34]:
y_pred_mlp = np.argmax(mlp.predict(X_test), axis=1)
f1_mlp = f1_score(y_test, y_pred_mlp, average='weighted')
f1_mlp
111/111 [==============================] - 0s 389us/step
Out[34]:
0.9479212419876087

We compared the performance of the CNN3 model and the MLP on the following factors:

  1. Accuracy and confusion matrix: accuracy for CNN3 is around 95%, and for the MLP around 94.7%.
  2. F1 scores: the F1 score for CNN3 is around 95%, and for the MLP around 94.7%.
  3. ROC: both CNN3 and the MLP follow a similar trajectory in discriminating between true positives and false positives, with CNN3 performing slightly better.
  4. p-value: the p-value based on the Area Under the Curve (AUC) of the ROC curves indicates no significant difference between the models in discriminatory power.

Based on the accuracy scores, F1 scores and ROC trajectories, both models perform almost equally well in classifying the images. The non-significant p-value on the AUCs reinforces that there is no substantial difference in discriminatory performance. This is an important result, as it suggests both models are equally effective at distinguishing positive from negative instances.

What does this mean? For this particular use case, given the nature of the dataset, both the MLP and the CNN classify the dataset well, and adding more complexity to the networks may not be particularly beneficial here. However, there is always room for further experimentation with the hyperparameters and layers of the network to keep improving the models' capability.
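The z-test used above is a crude approximation. A distribution-free alternative is to bootstrap the AUC difference and check whether its confidence interval covers zero. A sketch on synthetic scores (the score arrays are illustrative, and `auc_score` assumes continuous, effectively untied scores):

```python
import numpy as np

def auc_score(y_true, scores):
    # ROC AUC via the Mann-Whitney U rank statistic
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc_diff(y_true, s_a, s_b, n_boot=2000, seed=0):
    # 95% percentile CI for AUC(a) - AUC(b) over paired resamples
    rng = np.random.default_rng(seed)
    n, diffs = len(y_true), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if 0 < y_true[idx].sum() < n:  # resample must contain both classes
            diffs.append(auc_score(y_true[idx], s_a[idx]) - auc_score(y_true[idx], s_b[idx]))
    return np.percentile(diffs, [2.5, 97.5])

# Illustrative scores: model a is slightly less noisy than model b
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 400)
s_a = y + rng.normal(0, 0.8, 400)
s_b = y + rng.normal(0, 1.0, 400)
lo, hi = bootstrap_auc_diff(y, s_a, s_b)
print(f"95% CI for AUC(a) - AUC(b): [{lo:.3f}, {hi:.3f}]")
```

If the interval excludes zero, the AUC gap is unlikely to be a resampling artifact; resampling in matched pairs keeps the comparison paired, as in the notebook's test-set setting.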

Transfer Learning : (Pre-trained model - VGG16)¶

In [29]:
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Flatten, Dropout, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Assuming X_train, y_train, X_test, y_test are your datasets
# Preprocess your data if needed

# Convert labels to one-hot encoding
NUM_CLASSES = 2
y_ohe = to_categorical(y, NUM_CLASSES)

# Split the data (note: this re-splits the full dataset, so rows from the
# earlier X_test may end up in this training split)
X_train, X_val, y_train_ohe, y_val_ohe = train_test_split(X, y_ohe, test_size=0.2, random_state=42)

# Replicate single-channel grayscale image into three channels
X_train_rgb = np.repeat(X_train, 3, axis=-1)
X_val_rgb = np.repeat(X_val, 3, axis=-1)
X_test_rgb = np.repeat(X_test, 3, axis=-1)

# Load the pre-trained VGG16 model without the top (fully connected) layers
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

# Create a new model on top using the Functional API
input_layer = Input(shape=(64, 64, 3))
x = base_model(input_layer)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
output_layer = Dense(NUM_CLASSES, activation='softmax')(x)

model = Model(inputs=input_layer, outputs=output_layer)

# Compile the model
optimizer = Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history_transfer_learning = model.fit(
    X_train_rgb, y_train_ohe,
    epochs=4,
    validation_data=(X_val_rgb, y_val_ohe)
)

# Evaluate the model
test_loss, test_acc = model.evaluate(X_test_rgb, y_test_ohe)
print(f'Test accuracy with transfer learning (VGG16): {test_acc}')
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
Epoch 1/4
444/444 [==============================] - 992s 2s/step - loss: 0.1699 - accuracy: 0.9343 - val_loss: 0.1209 - val_accuracy: 0.9527
Epoch 2/4
444/444 [==============================] - 450s 1s/step - loss: 0.0866 - accuracy: 0.9683 - val_loss: 0.2276 - val_accuracy: 0.9223
Epoch 3/4
444/444 [==============================] - 480s 1s/step - loss: 0.0685 - accuracy: 0.9747 - val_loss: 0.0621 - val_accuracy: 0.9780
Epoch 4/4
444/444 [==============================] - 476s 1s/step - loss: 0.0539 - accuracy: 0.9811 - val_loss: 0.0515 - val_accuracy: 0.9831
111/111 [==============================] - 30s 270ms/step - loss: 0.0515 - accuracy: 0.9831
Test accuracy with transfer learning (VGG16): 0.9831081032752991
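The grayscale-to-RGB step above simply copies the single channel three times so the images match VGG16's expected three-channel input. A minimal check of that behaviour on a toy batch:

```python
import numpy as np

# Toy batch: 2 grayscale 4x4 images with one channel, shaped like X_train
gray = np.random.default_rng(0).random((2, 4, 4, 1))
rgb = np.repeat(gray, 3, axis=-1)

print(rgb.shape)  # → (2, 4, 4, 3)
# All three channels are identical copies of the grayscale values
assert np.array_equal(rgb[..., 0], rgb[..., 1])
assert np.array_equal(rgb[..., 1], rgb[..., 2])
```

Replication keeps the image content unchanged; the convolution filters in the pre-trained network just see the same intensity in each channel.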
In [30]:
from sklearn.metrics import confusion_matrix, classification_report, f1_score
import seaborn as sns
import matplotlib.pyplot as plt

# Predict on the test set
y_pred = model.predict(X_test_rgb)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true_classes = np.argmax(y_test_ohe, axis=1)

# Confusion Matrix
conf_matrix = confusion_matrix(y_true_classes, y_pred_classes)

# Plot the confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', cbar=False,
            xticklabels=['Non-Vehicle', 'Vehicle'],
            yticklabels=['Non-Vehicle', 'Vehicle'])
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()

# Classification Report
class_report = classification_report(y_true_classes, y_pred_classes)
print("Classification Report:\n", class_report)

# F1 Score
f1 = f1_score(y_true_classes, y_pred_classes, average='binary')
print(f'F1 Score: {f1:.4f}')
111/111 [==============================] - 33s 295ms/step
Classification Report:
               precision    recall  f1-score   support

           0       0.99      0.98      0.98      1842
           1       0.98      0.99      0.98      1710

    accuracy                           0.98      3552
   macro avg       0.98      0.98      0.98      3552
weighted avg       0.98      0.98      0.98      3552

F1 Score: 0.9825

Comparing CNN3 with pre-trained model.¶

In [31]:
print('Pre-Trained Model (VGG16) - F1 Score:', f1)
print('CNN3 - F1 Score:', f1_cnn3)
Pre-Trained Model (VGG16) - F1 Score: 0.9825072886297377
CNN3 - F1 Score: 0.9507410848161573

Comparing the F1 scores of CNN3 and VGG16, we can see that VGG16 outperforms CNN3: it classifies the vehicles better by a solid, if not huge, margin. Referring to the confusion matrix, the number of false positives drops noticeably with VGG16, while the false negative counts are quite close. Overall, VGG16 has enhanced the classification of vehicles. The pre-trained features, increased model capacity, knowledge transfer, generalization capabilities, and other advantages of a well-established, pre-trained model all act in VGG16's favour: it learns to understand and generalize to unseen data better than our custom-built CNN network.

References:¶

[Kaggle]. Vehicle Image Detection Dataset. Retrieved from https://www.kaggle.com/datasets/brsdincer/vehicle-detection-image-set